Lab 2
Advanced Data Visualization
Instructions
- Create a Quarto file for ALL of Lab 2 (no separate files for Parts 1 and 2).
- Make sure your final file is carefully formatted, so that each analysis is clear and concise.
- Be sure your knitted .html file shows all your source code, including any function definitions.
Part One: Identifying Bad Visualizations
If you happen to be bored and looking for a sensible chuckle, you should check out these Bad Visualisations. Looking through these is also a good exercise in cataloging what makes a visualization good or bad.
Dissecting a Bad Visualization
Below is an example of a less-than-ideal visualization from the collection linked above. It comes to us from data provided for the Wellcome Global Monitor 2018 report by the Gallup World Poll:
- While there are certainly issues with this image, do your best to tell the story of this graph in words. That is, what is this graph telling you? What do you think the authors meant to convey with it?
It appears that this image is trying to represent the proportion of people in each country who agreed with the statement “Vaccines are safe”. The data come from 2018 and are grouped by global region. The regions are stacked so that the regional median share of affirmative answers increases from the bottom of the plot to the top.
- List the variables that appear to be displayed in this visualization. Hint: Variables refer to columns in the data.
Variables include:
- Percentage of people who believe that vaccines are safe
- Global region
- Region medians
- Countries
- Now that you’re versed in the grammar of graphics (e.g., ggplot), list the aesthetics used and which variables are mapped to each.
The aesthetics map to variables in the following ways:
- x is mapped to the proportion of the population that believes that vaccines are safe
- y is mapped to…nothing?
- color is mapped to global region
- label is mapped to individual country names
- Each point represents the share of a country’s population that believes vaccines are safe, and is drawn with geom_point()
- Vertical lines are added using geom_vline() to show regional medians, which increase as one looks higher in the plot
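The mappings listed above can be sketched in ggplot2 roughly as follows. This is only an illustration under stated assumptions: the data frame toy, its column names (country, region, pct_safe), and every value in it are hypothetical and are not taken from the Wellcome dataset. Mapping y to country is also an assumption, used here just to spread the points vertically within each region.

```r
library(ggplot2)

# Hypothetical toy data: values are invented for illustration only.
toy <- data.frame(
  country  = c("A", "B", "C", "D", "E", "F"),
  region   = c("Region 1", "Region 1", "Region 1",
               "Region 2", "Region 2", "Region 2"),
  pct_safe = c(0.55, 0.70, 0.80, 0.75, 0.85, 0.95)
)

# One median per region, used for the vertical reference lines.
medians <- aggregate(pct_safe ~ region, data = toy, FUN = median)

p <- ggplot(toy, aes(x = pct_safe, y = country, color = region)) +
  # One point per country.
  geom_point() +
  # Country names as labels, nudged above the points.
  geom_text(aes(label = country), vjust = -1, show.legend = FALSE) +
  # geom_vline() does not inherit the plot-level aesthetics,
  # so color is mapped again here.
  geom_vline(
    data = medians,
    aes(xintercept = pct_safe, color = region),
    linetype = "dashed"
  ) +
  # Stack the regions vertically, one panel per region.
  facet_grid(rows = vars(region), scales = "free_y")
```

Printing p draws the plot. Because medians carries a region column, each dashed line appears only in its own region's panel.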
- What type of graph would you call this? Meaning, what geom would you use to produce this plot?
This appears to be a scatterplot with a quasi-faceting effect: countries are grouped by region, and the regions are stacked vertically in order of their median proportion of belief in vaccine safety. I would use geom_point() to create this plot.
- Provide at least four problems or changes that would improve this graph. Please format your changes as bullet points!
Four ways to improve this plot are:
- Eliminate the legend
- Double-code the points to further distinguish them beyond color
- Remove the impression that the y-axis within each facet represents something quantitative
- Make points clickable so that one can see proportions for individual countries
Improving the Bad Visualization
The data for the Wellcome Global Monitor 2018 report can be downloaded at the following site: https://wellcome.ac.uk/reports/wellcome-global-monitor/2018
There are two worksheets in the downloaded dataset file. You may need to read them in separately, but you may also just use one if it suffices.
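One possible way to read the worksheets is sketched below, assuming the download is an Excel workbook and that the readxl package is installed. The helper read_all_sheets() and the file name in the commented-out call are hypothetical; substitute the actual name of the downloaded file.

```r
library(readxl)

# Read every worksheet of an Excel workbook into a named list of
# data frames, one element per sheet.
read_all_sheets <- function(path) {
  sheets <- excel_sheets(path)
  out <- lapply(sheets, function(s) read_excel(path, sheet = s))
  names(out) <- sheets
  out
}

# Hypothetical file name; replace with the real path to the download.
# wgm <- read_all_sheets("wgm2018-dataset.xlsx")
# names(wgm)   # lists the worksheet names found in the file
```

If only one worksheet is needed, calling read_excel() directly with its sheet argument is enough.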
- Improve the visualization above by either re-creating it with the issues you identified fixed OR by creating a new visualization that you believe tells the same story better.
Part Two: Broad Visualization Improvement
The full Wellcome Global Monitor 2018 report can be found here: https://wellcome.ac.uk/sites/default/files/wellcome-global-monitor-2018.pdf. Surprisingly, the visualization above does not appear in the report despite the citation in the bottom corner of the image!
Second Data Visualization Improvement
For this second plot, you must select a plot that uses maps so you can demonstrate your proficiency with the leaflet package!
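As a starting point, a minimal leaflet map might look like the sketch below. The data frame pts, its coordinates, and its values are all invented for illustration and do not come from the report.

```r
library(leaflet)

# Hypothetical points: names, coordinates, and values are made up.
pts <- data.frame(
  name  = c("Place A", "Place B"),
  lng   = c(-120.66, 2.35),
  lat   = c(35.28, 48.86),
  value = c(0.4, 0.7)
)

# Continuous color scale over the observed range of value.
pal <- colorNumeric(palette = "Blues", domain = pts$value)

m <- leaflet(pts) %>%
  addTiles() %>%                       # default OpenStreetMap base layer
  addCircleMarkers(
    lng = ~lng, lat = ~lat,
    color = ~pal(value),               # color each marker by its value
    label = ~name                      # shown on hover
  ) %>%
  addLegend(pal = pal, values = ~value, title = "Value")
```

Printing m in a Quarto document that renders to HTML embeds the interactive map in the output.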
Select a data visualization in the report that you think could be improved. Be sure to cite both the page number and figure title. Do your best to tell the story of this graph in words. That is, what is this graph telling you? What do you think the authors meant to convey with it?
List the variables that appear to be displayed in this visualization.
Now that you’re versed in the grammar of graphics (ggplot), list the aesthetics used and which variables are specified for each.
What type of graph would you call this?
List all of the problems or things you would improve about this graph.
Improve the visualization above by either re-creating it with the issues you identified fixed OR by creating a new visualization that you believe tells the same story better.
Agreement vs Disagreement that Technology Will Increase the Number of Jobs in My Country in the Next Five Years
Third Data Visualization Improvement
For this third plot, you must use one of the other ggplot2 extension packages mentioned this week (e.g., gganimate, plotly, patchwork, cowplot).
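For instance, plotly can turn an existing ggplot into an interactive widget with ggplotly(). The sketch below uses R's built-in mtcars data purely as a stand-in; it is not the lab data, and the choice of variables is arbitrary.

```r
library(ggplot2)
library(plotly)

# Static ggplot; the text aesthetic carries the label shown in the tooltip.
p <- ggplot(mtcars, aes(x = wt, y = mpg, text = rownames(mtcars))) +
  geom_point()

# Convert to an interactive plotly widget; hovering over a point shows
# the car name along with its x and y values.
interactive <- ggplotly(p, tooltip = c("text", "x", "y"))
```

Printing interactive in an HTML document gives hover tooltips, which is one way to let a reader inspect individual points.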
Select a data visualization in the report that you think could be improved. Be sure to cite both the page number and figure title. Do your best to tell the story of this graph in words. That is, what is this graph telling you? What do you think the authors meant to convey with it?
List the variables that appear to be displayed in this visualization.
Now that you’re versed in the grammar of graphics (ggplot), list the aesthetics used and which variables are specified for each.
What type of graph would you call this?
List all of the problems or things you would improve about this graph.
Improve the visualization above by either re-creating it with the issues you identified fixed OR by creating a new visualization that you believe tells the same story better.